7 research outputs found

    Traitement FrameNet des constructions à attribut de l'objet

    National audience. Within the ASFALDA project, which includes the production of a French FrameNet, we try to provide a linguistically motivated treatment of a typical example of syntax-semantics mismatch: the object complement construction. To do so, we first give an overview of the syntactic and semantic properties of object complement constructions. Next, we study how FrameNet deals with the English verbs that typically take part in those constructions, and finally argue for a homogenized treatment of the construction within the French FrameNet. Keywords: FrameNet, French, object complement construction, syntax-semantics mismatch.

    Corpus annotation within the French FrameNet: a domain-by-domain methodology

    International audience. This paper reports on the development of a French FrameNet within the ASFALDA project. While the first phase of the project focused on the development of a French set of frames and the corresponding lexicon (Candito et al., 2014), this paper concentrates on the subsequent corpus annotation phase, which focused on four notional domains (commercial transactions, cognitive stances, causality and verbal communication). Given that full coverage is not reachable for a relatively "new" FrameNet project, we argue that focusing on specific notional domains allowed us to obtain full lexical coverage for the frames of these domains, while partially reflecting word sense ambiguities. Furthermore, as frames and roles were annotated on two French treebanks (the French Treebank (Abeillé and Barrier, 2004) and the Sequoia Treebank (Candito and Seddah, 2012)), we were able to extract a syntactico-semantic lexicon from the annotated frames. In its current state, the resource contains 98 frames, 662 frame-evoking words, 872 senses, and about 13,000 annotated frames, with their semantic roles assigned to portions of text. The French FrameNet is freely available at alpage.inria.fr/asfalda.

    A general framework for the annotation of causality based on FrameNet

    International audience. We present a general set of semantic frames for annotating causal expressions, with a rich lexicon in French and an annotated corpus of about 4000 instances of causal lexical items with their corresponding semantic frames. The aim of our project is both to have the largest possible coverage of causal phenomena in French, across all parts of speech, and to have it linked to a general semantic framework such as FrameNet (FN). Such a link lets us benefit in particular from the relations with other semantic frames, e.g., temporal or intentional ones, and from the underlying upper lexical ontology, which enables some forms of reasoning. This work is part of the larger ASFALDA French FrameNet project, which focuses on a few notional domains that are interesting in their own right (Djemaa et al., 2016), including cognitive positions and communication frames. In the process of building the French lexicon and preparing the annotation of the corpus, we had to remodel some of the frames proposed in FN on the basis of English data, aiming for more precise frame definitions to facilitate human annotation. This includes semantic clarifications of frames and frame elements, redundancy elimination, and added coverage. The result is arguably a significant improvement in the treatment of causality in FN itself.

    Developing a French FrameNet: Methodology and First results

    International audience. The Asfalda project aims to develop a French corpus with frame-based semantic annotations and automatic tools for shallow semantic analysis. We present the first part of the project: focusing on a set of notional domains, we delimited a subset of English frames, adapted them to French data when necessary, and developed the corresponding French lexicon. We believe that working domain by domain helped us to enforce the coherence of the resulting resource, and also has the advantage that, though the number of frames is limited (around a hundred), we obtain full coverage within a given domain.

    Domain by domain strategy for creating a French FrameNet: corpus annotations of semantic frames and roles

    This thesis describes the creation of the French FrameNet (FFN), a FrameNet-type resource for French built from the Berkeley FrameNet (Baker et al., 1998) and two morphosyntactic treebanks: the French Treebank (Abeillé et al., 2003) and the Sequoia Treebank (Candito and Seddah, 2012). The Berkeley FrameNet (BFN) allows for semantic annotation of prototypical situations and their participants. It consists of: (a) a structured set of prototypical situations, called frames, which incorporate semantic characterizations of the situations' participants (Frame Elements, or FEs); (b) a lexicon of lexical units (LUs) that can evoke those frames; (c) a set of English-language frame annotations.

    In order to create the FFN, we designed a "domain by domain" methodology: we defined four "domains", each centered on a specific notion (cause, verbal communication, cognitive stance, or commercial transaction). We then sought to obtain full frame and lexical coverage for these domains, and annotated the first 100 corpus occurrences of each LU in our domains. This strategy guarantees greater consistency in frame structuring than other approaches and is conducive to work on both intra-domain and inter-domain frame polysemy. Annotating frames on continuous text, without selecting particular LU occurrences, preserves the natural distribution of lexical and syntactic characteristics of frame-evoking elements in our corpus. At present, the FFN includes 105 distinct frames and 873 distinct LUs, which combine into 1,109 LU-frame pairs (i.e. 1,109 senses). In total, 16,167 frame occurrences, together with their FEs, have been annotated in our corpus.

    In this thesis, I first situate the FrameNet model in a larger theoretical background. I then justify our use of the Berkeley FrameNet as our resource base and explain why we used a domain-by-domain methodology. I next clarify some specific BFN notions that we found too vague to be applied coherently in building the FFN. Specifically, I introduce more directly syntactic criteria both for defining a frame's lexical perimeter and for differentiating core FEs from non-core ones.

    Then, I describe the creation of the FFN itself: first the delimitation of the frame structure used in the resource, then the creation of a lexicon for these frames. I introduce in detail the Cognitive Stances notional domain, which includes frames having to do with a cognizer's degree of certainty about some particular content. Next, I describe our methodology for annotating a corpus with frames and FEs, and analyze our treatment of several linguistic phenomena that required additional consideration to obtain a coherent annotation (such as object complement constructions).

    Finally, I give quantitative information about the current status of the FFN and its evaluation. I conclude with some perspectives on improving and exploiting the FFN.

    Stratégie domaine par domaine pour la création d'un FrameNet du français : annotations en corpus de cadres et rôles sémantiques

    This thesis describes the creation of the French FrameNet (FFN), a FrameNet-type resource for French built from the Berkeley FrameNet (Baker et al., 1998) and two morphosyntactic treebanks: the French Treebank (Abeillé et al., 2003) and the Sequoia Treebank (Candito and Seddah, 2012). The Berkeley FrameNet (BFN) allows for semantic annotation of prototypical situations and their participants. It consists of: (a) a structured set of prototypical situations, called frames, which incorporate semantic characterizations of the situations' participants (Frame Elements, or FEs); (b) a lexicon of lexical units (LUs) that can evoke those frames; (c) a set of English-language frame annotations.

    In order to create the FFN, we designed a "domain by domain" methodology: we defined four "domains", each centered on a specific notion (cause, verbal communication, cognitive stance, or commercial transaction). We then sought to obtain full frame and lexical coverage for these domains, and annotated the first 100 corpus occurrences of each LU in our domains. This strategy guarantees greater consistency in frame structuring than other approaches and is conducive to work on both intra-domain and inter-domain frame polysemy. Annotating frames on continuous text, without selecting particular LU occurrences, preserves the natural distribution of lexical and syntactic characteristics of frame-evoking elements in our corpus. At present, the FFN includes 105 distinct frames and 873 distinct LUs, which combine into 1,109 LU-frame pairs (i.e. 1,109 senses). In total, 16,167 frame occurrences, together with their FEs, have been annotated in our corpus.

    In this thesis, I first situate the FrameNet model in a larger theoretical background. I then justify our use of the Berkeley FrameNet as our resource base and explain why we used a domain-by-domain methodology. I next clarify some specific BFN notions that we found too vague to be applied coherently in building the FFN. Specifically, I introduce more directly syntactic criteria both for defining a frame's lexical perimeter and for differentiating core FEs from non-core ones.

    Then, I describe the creation of the FFN itself: first the delimitation of the frame structure used in the resource, then the creation of a lexicon for these frames. I introduce in detail the Cognitive Stances notional domain, which includes frames having to do with a cognizer's degree of certainty about some particular content. Next, I describe our methodology for annotating a corpus with frames and FEs, and analyze our treatment of several linguistic phenomena that required additional consideration to obtain a coherent annotation (such as object complement constructions).

    Finally, I give quantitative information about the current status of the FFN and its evaluation. I conclude with some perspectives on improving and exploiting the FFN.